Disinformation Capabilities of Large Language Models
Automated disinformation generation is often listed as one of the risks of
large language models (LLMs). The theoretical ability to flood the information
space with disinformation content might have dramatic consequences for
democratic societies around the world. This paper presents a comprehensive
study of the disinformation capabilities of the current generation of LLMs to
generate false news articles in the English language. In our study, we
evaluated the capabilities of 10 LLMs using 20 disinformation narratives. We
evaluated several aspects of the LLMs: how well they generate news articles,
how strongly they tend to agree or disagree with the disinformation
narratives, how often they generate safety warnings, etc. We also evaluated the abilities of
detection models to detect these articles as LLM-generated. We conclude that
LLMs are able to generate convincing news articles that agree with dangerous
disinformation narratives.
Automated, not Automatic: Needs and Practices in European Fact-checking Organizations as a basis for Designing Human-centered AI Systems
To mitigate the negative effects of false information more effectively, the
development of automated AI (artificial intelligence) tools assisting
fact-checkers is needed. Despite the existing research, there is still a gap
between the fact-checking practitioners' needs and pains and the current AI
research. We aspire to bridge this gap by employing methods of information
behavior research to identify implications for designing better human-centered
AI-based supporting tools.
In this study, we conducted semi-structured in-depth interviews with Central
European fact-checkers. The information behavior and requirements on desired
supporting tools were analyzed using iterative bottom-up content analysis,
bringing the techniques from grounded theory. The most significant needs were
validated with a survey extended to fact-checkers from across Europe, in which
we collected 24 responses from 20 European countries, i.e., 62% of active
European IFCN (International Fact-Checking Network) signatories.
Our contributions are theoretical as well as practical. First, by being able
to map our findings about the needs of fact-checking organizations to the
relevant tasks for AI research, we have shown that the methods of information
behavior research are relevant for studying the processes in the organizations
and that these methods can be used to bridge the gap between the users and AI
researchers. Second, we have identified fact-checkers' needs and pains, focusing
on thus far unexplored dimensions and emphasizing the needs of fact-checkers from
Central and Eastern Europe as well as from low-resource language groups, which
has implications for the development of new resources (datasets) as well as for
the focus of AI research in this domain.
Comment: 41 pages, 13 figures, 1 table, 2 annexes
Is it indeed bigger better? The comprehensive study of claim detection LMs applied for disinformation tackling
This study compares the performance of (1) fine-tuned models and (2)
extremely large language models on the task of check-worthy claim detection.
For the purpose of the comparison we composed a multilingual and multi-topical
dataset comprising texts of various sources and styles. Building on this, we
performed a benchmark analysis to determine the most general multilingual and
multi-topical claim detector.
We chose three state-of-the-art models in the check-worthy claim detection
task and fine-tuned them. Furthermore, we selected three state-of-the-art
extremely large language models without any fine-tuning. We made modifications
to the models to adapt them to multilingual settings and carried out extensive
experimentation and evaluation. We assessed the performance of all the models
in terms of accuracy, recall, and F1-score in in-domain and cross-domain
scenarios. Our results demonstrate that despite the technological progress in
the area of natural language processing, the models fine-tuned for the task of
check-worthy claim detection still outperform zero-shot approaches in
cross-domain settings.
Comment: 27 pages, 10 figures
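The in-domain and cross-domain comparison above rests on standard classification metrics. As a minimal sketch (the labels below are made up for illustration, not taken from the paper's data), accuracy, recall, and F1-score for a binary check-worthiness task can be computed as:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, and F1 for a binary task
    (e.g. check-worthy vs. not check-worthy claims)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, recall, f1

# Toy example: 1 = check-worthy claim, 0 = not check-worthy
acc, rec, f1 = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# acc = 0.6, rec ≈ 0.667, f1 ≈ 0.667
```

Cross-domain evaluation simply means `y_pred` comes from a model trained on texts of a different source or topic than `y_true`'s.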
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification
Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning
techniques designed to make the training of language models more efficient.
Previous results demonstrated that these methods can even improve performance
on some classification tasks. This paper complements the existing research by
investigating how these techniques influence the classification performance and
computation costs compared to full fine-tuning when applied to multilingual
text classification tasks (genre, framing, and persuasion techniques detection;
with different input lengths, numbers of predicted classes, and classification
difficulty), some of which have limited training data. In addition, we conduct
in-depth analyses of their efficacy across different training scenarios
(training on the original multilingual data; on the translations into English;
and on a subset of English-only data) and different languages. Our findings
provide valuable insights into the applicability of the parameter-efficient
fine-tuning techniques, particularly to complex multilingual and multilabel
classification tasks.
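The parameter savings behind LoRA can be illustrated with a small numeric sketch (the dimensions and rank below are illustrative assumptions, not figures from the paper): instead of updating a full weight matrix W, LoRA freezes W and trains a low-rank update B·A, so only r·(d_in + d_out) parameters are learned rather than d_in·d_out.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameter counts for a LoRA update W + B @ A,
    where A is (rank x d_in) and B is (d_out x rank)."""
    full = d_in * d_out           # full fine-tuning of this matrix
    lora = rank * (d_in + d_out)  # low-rank adaptation only
    return full, lora

# Illustrative transformer projection: 768 x 768 weights, rank 8
full, lora = lora_trainable_params(768, 768, 8)
# lora / full ≈ 0.021 → roughly 2% of the parameters are trained
```

This ratio is why such techniques make multilingual fine-tuning cheaper, at the cost of the performance trade-offs the paper investigates.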
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts
In the realm of text manipulation and linguistic transformation, the question
of authorship has always been a subject of fascination and philosophical
inquiry. Much like the Ship of Theseus paradox, which ponders whether a ship
remains the same when each of its original planks is replaced, our research
delves into an intriguing question: does a text retain its original authorship
when it undergoes numerous paraphrasing iterations?
Specifically, since Large Language Models (LLMs) have demonstrated remarkable
proficiency in the generation of both original content and the modification of
human-authored texts, a pivotal question emerges concerning the determination
of authorship in instances where LLMs or similar paraphrasing tools are
employed to rephrase the text. This inquiry revolves around whether authorship
should be attributed to the original human author or the AI-powered tool,
given the tool's independent capacity to produce text that closely resembles
human-generated content. Therefore, we embark on a philosophical voyage
through the seas of language and authorship to unravel this intricate puzzle.
Unravelling the basic concepts and intents of misbehavior in post-truth society
Objective: To explore the definitions of, and connections between, the terms misinformation, disinformation, fake news, rumors, hoaxes, propaganda, and related forms of misbehavior in the online environment. Another objective is to infer the intent of the authors, where relevant.
Design/Methodology/Approach: A conceptual analysis of three hundred fifty articles or monographs from all types of disciplines was carried out, with priority given to articles focused on terminological analysis. A conceptual map of the terminology relevant to the post-truth era was created. In cases of a lack of agreement, the etymology of the terms, drawing on dictionaries, terminological databases, and encyclopedias, was favored.
Results/Discussion: The approach made it possible to delimit the borders between the core terms of post-truth society and to classify them according to the intents of the authors: power (influence), money, fun, sexual harassment, hate/discord, ignorance, passion, and socialization. The following features were identified to differentiate the concepts: falsity (misleadingness, deceptiveness, lack of verification), accuracy, completeness, currency, medium, intent, and analyzable unit. The conceptual map summarizing and visualizing our findings is attached to the article.
Conclusions: We argued that disinformation and misinformation are different terms with different authors and intents in the online environment. Likewise, fake news was delimited as a species of disinformation, limited by its medium and financial intent. The intent of hoaxers is rather the amusement of the authors or the spreading of discord between different groups of society. The intent and analyzable units (statement, claim, article, message, event, story, and narrative) identified in the literature are crucial for the understanding and communication between social (human) scientists and computer scientists in order to better detect and mitigate various types of false information.
Originality/Value: The study provides a theoretical background for detecting, analyzing, and mitigating false information and misbehavior.
Tracing Strength of Relationships in Social Networks
The current web is known as a space with constantly growing interactivity among its users. It is changing from a data store into a place of social interaction, where people not only search for interesting information but also communicate and collaborate. Social networks are clearly the most used places for common interaction among people. We present a method for analyzing the strength of relationships together with their evolution. The method is based on various user activities in social networks. We evaluate our approach within the Facebook social network.
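As an illustrative sketch of the general idea, not the paper's actual model, a relationship's strength can be scored as a weighted sum of interaction counts between two users; the activity types and weights below are hypothetical:

```python
# Hypothetical tie-strength score. Activity types and weights are
# illustrative assumptions, not the method proposed in the paper.
WEIGHTS = {"comment": 3.0, "tag": 2.0, "like": 1.0}

def tie_strength(activities):
    """activities maps activity type -> count of interactions
    observed between one pair of users."""
    return sum(WEIGHTS.get(kind, 0.0) * count
               for kind, count in activities.items())

# Two comments, one tag, and five likes between a pair of users
score = tie_strength({"comment": 2, "tag": 1, "like": 5})
# 2*3.0 + 1*2.0 + 5*1.0 = 13.0
```

Tracking this score over successive time windows would then expose the evolution of the relationship that the paper analyzes.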